Potter's Wheel: An Interactive Data Cleaning System

Authors

  • Vijayshankar Raman
  • Joseph M. Hellerstein
Abstract

Cleaning data of errors in structure and content is important for data warehousing and integration. Current solutions for data cleaning involve many iterations of data “auditing” to find errors, and long-running transformations to fix them. Users need to endure long waits, and often write complex transformation scripts. We present Potter’s Wheel, an interactive data cleaning system that tightly integrates transformation and discrepancy detection. Users gradually build transformations to clean the data by adding or undoing transforms on a spreadsheet-like interface; the effect of a transform is shown at once on records visible on screen. These transforms are specified either through simple graphical operations, or by showing the desired effects on example data values. In the background, Potter’s Wheel automatically infers structures for data values in terms of user-defined domains, and accordingly checks for constraint violations. Thus users can gradually build a transformation as discrepancies are found, and clean the data without writing complex programs or enduring long delays.

Similar resources

Potter's Wheel: An Interactive Framework for Data Cleaning and Transformation

Real world data often has discrepancies in structure and content. Traditional methods for "cleaning" the data involve many iterations of time-consuming "data quality" analysis to find discrepancies, and long-running transformations to fix them. This process requires users to endure long waits and often write complex transformation programs. We present an interactive framework for data cleaning that...

Estimation of Profiles of Sherds of Archaeological Pottery

In this paper, a method for profile estimation of archaeological pottery based on its fragments (sherds) is presented. Since the investigated pots were made on a potter's wheel, the rotational symmetry of the original objects is assumed. In addition, sherds are oriented before the estimation. Using these constraints, an acquisition method based on a model of a sherd is proposed. The method is...

Using Accelerometers to Command a Cleaning Service Robot

This work studies the effectiveness of using accelerometers to control a service robot. Two modes are proposed, a steering wheel and a movement identification mode. The validation platform is an autonomous cleaning service robot that still needs a local Human Robot Interface. For convenience, accelerometer readings are obtained from the remote command of the Nintendo Wii console. The implementa...

An Interactive Framework for Data Cleaning

Cleaning organizational data of discrepancies in structure and content is important for data warehousing and Enterprise Data Integration (EDI). Current commercial solutions for data cleaning involve many iterations of time-consuming “data quality” analysis to find errors, and long-running transformations to fix them. Users need to endure long waits and often write complex transformation program...

The Design of Cleaning Robot Based on ARM Microprocessor

This paper describes the design of a household intelligent cleaning robot, an innovative work for a collegiate contest. It describes the general structure of the system, the hardware circuit design, and the software program design. The robot can achieve autonomous movement, garbage cleaning, obstacle avoidance and other functions. It can work in the home, library, exhibition halls and other indo...

Publication date: 2001